This script takes a deep dive into Landsat 9 labels for a more rigorous analysis of inconsistent band data and outliers in the filtered label dataset. Here we will determine if any more label data points should be removed from the training dataset and whether or not we can glean anything from the metadata in the outlier dataset to be able to pre-emptively toss out scenes when we go to apply the classification algorithm.
harmonize_version = "v2024-04-17"
outlier_version = "v2024-04-17"
LS9 <- read_rds(paste0("data/labels/harmonized_LS89_labels_", harmonize_version, ".RDS")) %>%
filter(mission == "LANDSAT_9")
Just look at the data to see consistent (or inconsistent) user-pulled data and our pull, here, our user data are in “BX” format and the re-pull is in “SR_BX” format. These are steps to assure data quality if the volunteer didn’t follow the directions explicitly.
pmap(.l = list(user_band = LS89_user,
ee_band = LS89_ee,
data = list(LS9),
mission = list("LANDSAT_9")),
.f = make_band_comp_plot)
## [[1]]
##
## [[2]]
##
## [[3]]
##
## [[4]]
##
## [[5]]
##
## [[6]]
There isn’t a ton of mis-match here, we’ll just use B7/SR_B7 as a reference to filter inconsistent labels
LS9_inconsistent <- LS9 %>%
filter((is.na(SR_B7) | SR_B7 != B7))
LS9_inconsistent %>%
group_by(class) %>%
summarise(n_labels = n()) %>%
kable()
| class | n_labels |
|---|---|
| cloud | 1 |
| lightNearShoreSediment | 1 |
| offShoreSediment | 1 |
Not much to report, we can just drop these few inconsistent band values.
LS9_filtered <- LS9 %>%
filter(# filter data where SR_B7 has data and where the values match between the two
# pulls.
(!is.na(SR_B7) & SR_B7 == B7) |
# or where the user-specified class is cloud and the pixel was saturated
# providing no surface refelctance data in the re-pull
(class == "cloud" & is.na(SR_B7)),
# or where any re-pulled band value is greater than 1, which isn't a valid value
if_all(LS89_ee,
~ . <= 1))
And plot:
## [[1]]
##
## [[2]]
##
## [[3]]
##
## [[4]]
##
## [[5]]
##
## [[6]]
Looks good!
And now let’s look at the data by class:
## [[1]]
##
## [[2]]
##
## [[3]]
##
## [[4]]
##
## [[5]]
##
## [[6]]
We aren’t actually modeling “other” (not sufficient observations to classify) or “shorelineContamination” (we’ll use this later to block areas where there is likely shoreline contamination in the AOI).
LS9_for_class_analysis <- LS9_filtered %>%
filter(!(class %in% c("other", "shorelineContamination")))
## [[1]]
##
## [[2]]
##
## [[3]]
##
## [[4]]
##
## [[5]]
##
## [[6]]
Interesting - the classes look really similar in distribution (maybe because cloud categories are so high). It will be interesting to see if there are statistical differences.
There are statistical outliers within this dataset and they may impact the interpretation of any statistical testing we do. Let’s see if we can narrow down when those outliers and/or glean anything from the outlier data that may be applicable to the the application of the algorithm. Outliers may be a systemic issue (as in the scene is an outlier), it could be a user issue (a user may have been a bad actor), or they just might be real. This section asks those questions. The “true outliers” that we dismiss from the dataset will also be used to help aid in interpretation/application of the algorithm across the Landsat stack, so it is important to make notes of any patterns we might see in the outlier dataset.
## [1] "Classes represented in outliers:"
## [1] "lightNearShoreSediment" "offShoreSediment" "openWater"
Okay, 25 outliers (>1.5*IQR) out of 448 - and they are all from non-cloud groups, and non of them are light near shore sediment.
How many of these outliers are in specific scenes?
LS9_out_date <- outliers %>%
group_by(date, vol_init) %>%
summarize(n_out = n())
LS9_date <- LS9_for_class_analysis %>%
filter(class != "cloud") %>%
group_by(date, vol_init) %>%
summarise(n_tot = n())
LS9_out_date <- left_join(LS9_out_date, LS9_date) %>%
mutate(percent_outlier = n_out/n_tot*100) %>%
arrange(-percent_outlier)
LS9_out_date %>%
kable()
| date | vol_init | n_out | n_tot | percent_outlier |
|---|---|---|---|---|
| 2022-06-06 | KRW | 7 | 17 | 41.176471 |
| 2022-05-21 | MRB | 12 | 83 | 14.457831 |
| 2021-11-06 | KRW | 2 | 30 | 6.666667 |
| 2022-05-05 | HAD | 4 | 118 | 3.389830 |
There is just one scene with a high proportion of ouliers - perhaps there is something about the AC in these particular scenes? or the general scene quality?
LS9_out_date %>%
filter(percent_outlier > 20) %>%
select(date, vol_init) %>%
left_join(., LS9) %>%
select(date, vol_init, DATA_SOURCE_AIR_TEMPERATURE:max_cloud_cover) %>%
distinct() %>%
kable()
| date | vol_init | DATA_SOURCE_AIR_TEMPERATURE | DATA_SOURCE_ELEVATION | DATA_SOURCE_OZONE | DATA_SOURCE_PRESSURE | DATA_SOURCE_REANALYSIS | DATA_SOURCE_TIRS_STRAY_LIGHT_CORRECTION | DATA_SOURCE_WATER_VAPOR | NADIR_OFFNADIR | CLOUD_COVER_list | IMAGE_QUALITY_OLI_list | IMAGE_QUALITY_TIRS_list | mean_cloud_cover | max_cloud_cover |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2022-06-06 | KRW | MODIS | GLS2000 | MODIS | Calculated | GEOS-5 FP-IT | MODIS | NADIR | 53.08; 27.14 | 9 | 9 | 40.11 | 53.08 |
Image quality is high and cloud cover is reasonable. Let’s look at that image:
This does indeed look a bit hazy, but there are some really interesting offshore sediment plumes here. Hard to tell if the color is in any way due to aerosol contamination, though.
Let’s look at instances where outliers are in at least three bands for a given label:
| date | class | vol_init | user_label_id | n_bands_out | bands_out |
|---|---|---|---|---|---|
| 2022-06-06 | lightNearShoreSediment | KRW | 114 | 3 | SR_B5; SR_B6; SR_B7 |
| 2022-06-06 | offShoreSediment | KRW | 125 | 3 | SR_B2; SR_B5; SR_B6 |
These are from the same scene. Let’s see if it shows up in our QA summaries.
Do any of the labels have QA pixel indications of cloud or cloud shadow? The first pass here is for all data that don’t have a label of “cloud” (not just outliers). Let’s see if the low certainty classification in the QA band is useful here (there is no medium certainty for LS8/9)
LS9_for_class_analysis %>%
mutate(QA = case_when(str_sub(QA_PIXEL_binary, 1, 2) %in% c(01, 11) ~ "cirrus",
str_sub(QA_PIXEL_binary, 3, 4) %in% c(01, 11) ~ "snow/ice",
str_sub(QA_PIXEL_binary, 5, 6) %in% c(01, 11) ~ "cloud shadow",
str_sub(QA_PIXEL_binary, 7, 8) %in% c(01, 11) ~ "cloud",
TRUE ~ "clear")) %>%
group_by(QA) %>%
filter(class != "cloud") %>%
summarize(n_tot = n()) %>%
kable()
| QA | n_tot |
|---|---|
| cirrus | 8 |
| clear | 243 |
| cloud shadow | 1 |
| snow/ice | 1 |
Low confidence is much better than medium confidence in LS8 than 5/7 - let’s check to see that the classes are the same for high confidence:
LS9_for_class_analysis %>%
mutate(QA = case_when(str_sub(QA_PIXEL_binary, 1, 2) == 11 ~ "cirrus",
str_sub(QA_PIXEL_binary, 3, 4) == 11 ~ "snow/ice",
str_sub(QA_PIXEL_binary, 5, 6) == 11 ~ "cloud shadow",
str_sub(QA_PIXEL_binary, 7, 8) == 11 ~ "cloud",
TRUE ~ "clear")) %>%
group_by(QA) %>%
filter(class != "cloud") %>%
summarize(n_tot = n()) %>%
kable()
| QA | n_tot |
|---|---|
| cirrus | 8 |
| clear | 243 |
| cloud shadow | 1 |
| snow/ice | 1 |
They are the same! Let’s look at the cirrus group to see if there is anything egregious:
LS9_for_class_analysis %>%
filter(str_sub(QA_PIXEL_binary, 1, 2) == 11,
class != "cloud") %>%
group_by(date, vol_init) %>%
summarise(n_cirrus = n()) %>%
arrange(-n_cirrus) %>%
kable()
| date | vol_init | n_cirrus |
|---|---|---|
| 2022-05-05 | HAD | 8 |
Let’s look at this scene - there are 8 labels (that aren’t clouds) that have a cirrus flag:
This is quite hazy in the SE corner of the AOI, near the Apostles. While we can keep this scene since it is otherwise clear, we will again trust the QA bit to remove high aerosol pixels.
How many of these outliers have near-pixel clouds (as measured by ST_CDIST)?
There are 0 labels (0% of oultiers) that aren’t “cloud” in the outlier dataset that have a cloud distance <500m and 19 labels (4.2%) in the whole dataset that have a cloud distance <500m. This is unhelpful.
How many of the outliers have high cloud cover, as reported by the scene-level metadata? Note, we don’t have the direct scene cloud cover associated with individual labels, rather a list of the scene level cloud cover values associated with the AOI.
The outlier dataset contains 12 (48%) where the max cloud cover was > 75% and 12 (48%) where the mean cloud cover was > 50%. The filtered dataset contains 83 (18.5%) where max was >75% and 83 (18.5%) where the mean cloud cover was > 50%. This is also unhelpful.
Pixels can also be saturated in one or more bands, we need to make sure that the QA_RADSAT for all labels (including clouds) are set to zero. During the re-pull, we masked satruated pixels, so this should be zero.
LS9_for_class_analysis %>%
mutate(radsat = if_else(QA_RADSAT == 0,
"n",
"y")) %>%
group_by(radsat) %>%
summarize(n_tot = n()) %>%
kable()
| radsat | n_tot |
|---|---|
| n | 448 |
Great! No bands are saturated!
Landsat 9 and 9 feature an Aerosol QA band, derived from Band 1. We should look through the data here to see if any of the labels are in high aerosol QA pixels, which the USGS suggests should not be used.
LS9_for_class_analysis %>%
mutate(aerosol = if_else(str_sub(SR_QA_AEROSOL_binary, 1, 2) == 11,
"y",
"n")) %>%
group_by(aerosol) %>%
filter(class != "cloud") %>%
summarize(n_tot = n()) %>%
kable()
| aerosol | n_tot |
|---|---|
| n | 167 |
| y | 86 |
And let’s look to see when the instances of high aerosol are:
LS9_for_class_analysis %>%
filter(str_sub(SR_QA_AEROSOL_binary, 1, 2) == 11) %>%
group_by(date) %>%
filter(class != "cloud") %>%
summarize(n_tot = n()) %>%
arrange(-n_tot) %>%
kable()
| date | n_tot |
|---|---|
| 2022-05-05 | 59 |
| 2022-05-21 | 20 |
| 2022-06-06 | 7 |
We already know that the scenes 2022-05-05 and 2022-06-06 have been flagged, let’s also look at 2022-05-21:
Well, some of the scene is pretty clear, but there is some of the green-cloud artifact of high aerosol evident in this scene.
I’m also interested in the scene quality here:
LS9_for_class_analysis %>%
filter(date == "2022-05-21") %>%
pluck("IMAGE_QUALITY_OLI_list") %>%
unique() %>%
unlist()
## [1] "9"
As with other instances, IMAGE QUALITY is high. Since there is quite a bit of this scene that is clear, I’m again going to trust the QA bits.
For the purposes of training data, we are going to throw out the the high aerosol and cirrus flags
LS9_training_labels <- LS9_for_class_analysis %>%
# drop the labels with cirrus contamination and high aerosol
filter(!(str_sub(QA_PIXEL_binary, 1, 2) == 11),
# drop all the labels with high aerosol, unless the class is cloud
(str_sub(SR_QA_AEROSOL_binary, 1, 2) != 11 | class == "cloud"))
We do want to have an idea of how different the classes are, in regards to band data. While there are a bunch of interactions that we could get into here, for the sake of this analysis, we are going to analyze the class differences by band.
Kruskal-Wallis assumptions:
ANOVA assumptions:
We can’t entirely assert sample independence and we know that variance and distribution is different for “cloud” labels, but those data also are visibly different from the other classes.
In order to systematically test for differences between classes and be able to intepret the data, we will need to know some things about our data:
With this workflow, most classes are statistically different - below are the cases where the pairwise comparison were not deemed statistically significant:
## # A tibble: 20 × 9
## band group1 group2 n1 n2 statistic p p.adj p.adj.signif
## <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <chr>
## 1 SR_B2 darkNearShoreSe… light… 28 65 1.71 0.0876 0.876 ns
## 2 SR_B2 darkNearShoreSe… offSh… 28 54 -1.47 0.141 1 ns
## 3 SR_B2 darkNearShoreSe… openW… 28 20 -2.55 0.0106 0.106 ns
## 4 SR_B2 offShoreSediment openW… 54 20 -1.55 0.122 1 ns
## 5 SR_B3 darkNearShoreSe… light… 28 65 1.56 0.118 1 ns
## 6 SR_B3 darkNearShoreSe… offSh… 28 54 -2.20 0.0279 0.279 ns
## 7 SR_B3 offShoreSediment openW… 54 20 -2.15 0.0312 0.312 ns
## 8 SR_B4 darkNearShoreSe… light… 28 65 0.883 0.377 1 ns
## 9 SR_B4 offShoreSediment openW… 54 20 -1.89 0.0594 0.594 ns
## 10 SR_B5 darkNearShoreSe… light… 28 65 0.644 0.519 1 ns
## 11 SR_B5 darkNearShoreSe… openW… 28 20 -2.30 0.0217 0.217 ns
## 12 SR_B5 offShoreSediment openW… 54 20 0.315 0.753 1 ns
## 13 SR_B6 darkNearShoreSe… light… 28 65 0.350 0.726 1 ns
## 14 SR_B6 darkNearShoreSe… openW… 28 20 -1.59 0.111 1 ns
## 15 SR_B6 lightNearShoreS… openW… 65 20 -2.13 0.0329 0.329 ns
## 16 SR_B6 offShoreSediment openW… 54 20 0.873 0.383 1 ns
## 17 SR_B7 darkNearShoreSe… light… 28 65 0.315 0.753 1 ns
## 18 SR_B7 darkNearShoreSe… openW… 28 20 -1.40 0.160 1 ns
## 19 SR_B7 lightNearShoreS… openW… 65 20 -1.89 0.0593 0.593 ns
## 20 SR_B7 offShoreSediment openW… 54 20 0.981 0.326 1 ns
This is arguably worse than LS8. I think part of the issue for 8 and 9 may be a quantity issue.
DNSS: dark near shore sediment, LNSS: light near shore sediment, OSS: offshore sediment
There are definitely some varying patterns here, let’s zoom in on the sediment classes.
DNSS: dark near shore sediment, LNSS: light near shore sediment, OSS: offshore sediment
Arguably not as scatter shot bad as LS8, but still might be tough to distinguish between classes.
As a back up, we should consider using aggregated sediment classes, where any label of sediment is treated as a general class of “sediment”. Let’s do the same process here to test for class significance.
## # A tibble: 3 × 9
## band group1 group2 n1 n2 statistic p p.adj p.adj.signif
## <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <chr>
## 1 SR_B5 openWater sediment 20 147 1.93 0.0539 0.162 ns
## 2 SR_B6 openWater sediment 20 147 1.03 0.302 0.906 ns
## 3 SR_B7 openWater sediment 20 147 0.827 0.408 1 ns
And let’s look at the scatter plots here:
And if we drop the cloud:
LS9_training_labels %>%
filter(agg_class != "cloud") %>%
ggpairs(., columns = LS89_ee, aes(color = agg_class)) +
scale_color_colorblind() +
scale_fill_colorblind() +
theme_few()
These seem pretty recognizable in visible band space.
Things to note for Landsat 9:
write_rds(LS9_training_labels, paste0("data/labels/LS9_labels_for_tvt_", outlier_version, ".RDS"))